Overview

Dataset statistics

Number of variables: 12
Number of observations: 2968
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 301.4 KiB
Average record size in memory: 104.0 B
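
The two memory figures above can be cross-checked with simple arithmetic. The sketch below assumes that each of the 12 columns stores one 8-byte value per row and that the row index adds another 8 bytes per row; this storage layout is an assumption and is not stated in the report.

```python
# Cross-check of the memory figures in the dataset statistics.
# Assumption (not stated in the report): every column uses 8 bytes per
# value, and the row index also uses 8 bytes per row.
N_ROWS = 2968
N_COLUMNS = 12
BYTES_PER_VALUE = 8  # assumed 64-bit storage


def record_size_bytes(n_columns, bytes_per_value=BYTES_PER_VALUE):
    """Bytes per row: one value per column plus one index entry."""
    return (n_columns + 1) * bytes_per_value


def total_size_kib(n_rows, n_columns):
    """Total in-memory size in KiB under the same assumption."""
    return n_rows * record_size_bytes(n_columns) / 1024


record_bytes = record_size_bytes(N_COLUMNS)
total_kib = round(total_size_kib(N_ROWS, N_COLUMNS), 1)
```

Under this assumption the per-record and total sizes agree with the reported 104.0 B and 301.4 KiB. By the same reasoning, one column plus the index gives 2968 × 16 / 1024 ≈ 46.4 KiB, which matches the memory size listed for each variable.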

Variable types

Numeric: 12

Alerts

gross_revenue is highly overall correlated with qtde_invoice and 3 other fields [High correlation]
recency_days is highly overall correlated with qtde_invoice [High correlation]
qtde_invoice is highly overall correlated with gross_revenue and 2 other fields [High correlation]
qtde_items is highly overall correlated with gross_revenue and 4 other fields [High correlation]
qtde_products is highly overall correlated with gross_revenue and 3 other fields [High correlation]
avg_basket_size is highly overall correlated with gross_revenue and 4 other fields [High correlation]
avg_unique_basket_size is highly overall correlated with qtde_products and 1 other field [High correlation]
freq is highly overall correlated with avg_rec_days [High correlation]
avg_ticket is highly overall correlated with avg_basket_size and 1 other field [High correlation]
avg_rec_days is highly overall correlated with freq [High correlation]
qtde_returns is highly overall correlated with qtde_items and 2 other fields [High correlation]
avg_ticket is highly skewed (γ1 = 25.15677718) [Skewed]
qtde_returns is highly skewed (γ1 = 26.84619429) [Skewed]
customer_id has unique values [Unique]
recency_days has 33 (1.1%) zeros [Zeros]
qtde_returns has 1481 (49.9%) zeros [Zeros]
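
The skewness alerts above report γ1, the coefficient of skewness. A minimal sketch of how such a check could work is shown below. It uses the population moment formula m3 / m2^1.5; the estimator actually used by the report may apply a small-sample correction, and the threshold of 20 is an assumption chosen only because it is consistent with this report (skewness values of 25.16 and 26.85 are flagged, while 17.66 and 18.76 are not).

```python
def skewness(values):
    """Population moment coefficient of skewness: m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        return 0.0  # constant column: no asymmetry to measure
    return m3 / m2 ** 1.5


# Hypothetical cut-off for illustration; not taken from the report's config.
SKEW_THRESHOLD = 20


def is_highly_skewed(values, threshold=SKEW_THRESHOLD):
    """Flag a column whose absolute skewness exceeds the threshold."""
    return abs(skewness(values)) > threshold
```

A column made almost entirely of zeros with a single large value, which is the pattern qtde_returns exhibits, produces a large positive skewness under this definition.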

Reproduction

Analysis started: 2022-12-20 12:52:39.945868
Analysis finished: 2022-12-20 12:53:15.696102
Duration: 35.75 seconds
Software version: pandas-profiling v3.5.0
Download configuration: config.json

Variables

customer_id
Real number (ℝ)

Distinct: 2968
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15270.377
Minimum: 12347
Maximum: 18287
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 46.4 KiB

Quantile statistics

Minimum: 12347
5th percentile: 12619.35
Q1: 13798.75
Median: 15220.5
Q3: 16768.5
95th percentile: 17964.65
Maximum: 18287
Range: 5940
Interquartile range (IQR): 2969.75

Descriptive statistics

Standard deviation: 1719.1445
Coefficient of variation (CV): 0.11258036
Kurtosis: -1.2061782
Mean: 15270.377
Median Absolute Deviation (MAD): 1489
Skewness: 0.032193711
Sum: 45322479
Variance: 2955457.9
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                 Count  Frequency (%)
17850                 1      < 0.1%
12670                 1      < 0.1%
17734                 1      < 0.1%
14905                 1      < 0.1%
16103                 1      < 0.1%
14626                 1      < 0.1%
14868                 1      < 0.1%
18246                 1      < 0.1%
17115                 1      < 0.1%
16611                 1      < 0.1%
Other values (2958)   2958   99.7%

Minimum 10 values

Value                 Count  Frequency (%)
12347                 1      < 0.1%
12348                 1      < 0.1%
12352                 1      < 0.1%
12356                 1      < 0.1%
12358                 1      < 0.1%
12359                 1      < 0.1%
12360                 1      < 0.1%
12362                 1      < 0.1%
12364                 1      < 0.1%
12370                 1      < 0.1%

Maximum 10 values

Value                 Count  Frequency (%)
18287                 1      < 0.1%
18283                 1      < 0.1%
18282                 1      < 0.1%
18277                 1      < 0.1%
18276                 1      < 0.1%
18274                 1      < 0.1%
18273                 1      < 0.1%
18272                 1      < 0.1%
18270                 1      < 0.1%
18269                 1      < 0.1%
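
Several of the customer_id statistics above are derived from the others, so they can be used to sanity-check the report. The sketch below copies the reported values into a dictionary (the dictionary keys are illustrative names, not identifiers from the report) and recomputes the range, IQR, coefficient of variation, mean and variance from them.

```python
# Reported values for customer_id, copied from the statistics above.
reported = {
    "n": 2968,
    "minimum": 12347,
    "maximum": 18287,
    "q1": 13798.75,
    "q3": 16768.5,
    "mean": 15270.377,
    "std": 1719.1445,
    "sum": 45322479,
}


def derived_statistics(stats):
    """Recompute statistics that follow directly from the reported ones."""
    return {
        "range": stats["maximum"] - stats["minimum"],  # max - min
        "iqr": stats["q3"] - stats["q1"],              # Q3 - Q1
        "cv": stats["std"] / stats["mean"],            # std / mean
        "mean": stats["sum"] / stats["n"],             # sum / count
        "variance": stats["std"] ** 2,                 # std squared
    }


derived = derived_statistics(reported)
```

The recomputed values agree with the reported range (5940), IQR (2969.75), CV (0.11258036), mean (15270.377) and variance (2955457.9) up to the rounding used in the report.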

gross_revenue
Real number (ℝ)

Distinct: 2953
Distinct (%): 99.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2691.263
Minimum: 6.2
Maximum: 279138.02
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 46.4 KiB

Quantile statistics

Minimum: 6.2
5th percentile: 229.7325
Q1: 570.845
Median: 1083.905
Q3: 2306.905
95th percentile: 7169.562
Maximum: 279138.02
Range: 279131.82
Interquartile range (IQR): 1736.06

Descriptive statistics

Standard deviation: 10113.975
Coefficient of variation (CV): 3.7580775
Kurtosis: 399.25289
Mean: 2691.263
Median Absolute Deviation (MAD): 671.49
Skewness: 17.664809
Sum: 7987668.7
Variance: 1.0229249 × 10^8
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                 Count  Frequency (%)
1078.96               2      0.1%
2053.02               2      0.1%
331                   2      0.1%
1353.74               2      0.1%
889.93                2      0.1%
745.06                2      0.1%
379.65                2      0.1%
2092.32               2      0.1%
731.9                 2      0.1%
734.94                2      0.1%
Other values (2943)   2948   99.3%

Minimum 10 values

Value                 Count  Frequency (%)
6.2                   1      < 0.1%
13.3                  1      < 0.1%
15                    1      < 0.1%
36.56                 1      < 0.1%
45                    1      < 0.1%
52                    1      < 0.1%
52.2                  1      < 0.1%
52.2                  1      < 0.1%
62.43                 1      < 0.1%
68.84                 1      < 0.1%

Maximum 10 values

Value                 Count  Frequency (%)
279138.02             1      < 0.1%
259657.3              1      < 0.1%
194550.79             1      < 0.1%
136263.72             1      < 0.1%
124564.53             1      < 0.1%
116725.63             1      < 0.1%
91062.38              1      < 0.1%
72882.09              1      < 0.1%
66653.56              1      < 0.1%
65019.62              1      < 0.1%

recency_days
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 272
Distinct (%): 9.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 64.31031
Minimum: 0
Maximum: 373
Zeros: 33
Zeros (%): 1.1%
Negative: 0
Negative (%): 0.0%
Memory size: 46.4 KiB

Quantile statistics

Minimum: 0
5th percentile: 2
Q1: 11
Median: 31
Q3: 81
95th percentile: 242
Maximum: 373
Range: 373
Interquartile range (IQR): 70

Descriptive statistics

Standard deviation: 77.760314
Coefficient of variation (CV): 1.2091423
Kurtosis: 2.7765932
Mean: 64.31031
Median Absolute Deviation (MAD): 26
Skewness: 1.7980702
Sum: 190873
Variance: 6046.6664
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                 Count  Frequency (%)
1                     99     3.3%
4                     87     2.9%
2                     85     2.9%
3                     85     2.9%
8                     76     2.6%
10                    67     2.3%
9                     66     2.2%
7                     66     2.2%
17                    64     2.2%
22                    55     1.9%
Other values (262)    2218   74.7%

Minimum 10 values

Value                 Count  Frequency (%)
0                     33     1.1%
1                     99     3.3%
2                     85     2.9%
3                     85     2.9%
4                     87     2.9%
5                     43     1.4%
7                     66     2.2%
8                     76     2.6%
9                     66     2.2%
10                    67     2.3%

Maximum 10 values

Value                 Count  Frequency (%)
373                   2      0.1%
372                   4      0.1%
371                   1      < 0.1%
368                   1      < 0.1%
366                   4      0.1%
365                   2      0.1%
364                   1      < 0.1%
360                   1      < 0.1%
359                   1      < 0.1%
358                   4      0.1%

qtde_invoice
Real number (ℝ)

Distinct: 57
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.7230458
Minimum: 1
Maximum: 206
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 46.4 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 2
Median: 4
Q3: 6
95th percentile: 17
Maximum: 206
Range: 205
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 8.8485429
Coefficient of variation (CV): 1.5461248
Kurtosis: 189.99723
Mean: 5.7230458
Median Absolute Deviation (MAD): 2
Skewness: 10.741904
Sum: 16986
Variance: 78.296712
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                 Count  Frequency (%)
2                     785    26.4%
3                     498    16.8%
4                     393    13.2%
5                     237    8.0%
1                     190    6.4%
6                     173    5.8%
7                     138    4.6%
8                     98     3.3%
9                     70     2.4%
11                    54     1.8%
Other values (47)     332    11.2%

Minimum 10 values

Value                 Count  Frequency (%)
1                     190    6.4%
2                     785    26.4%
3                     498    16.8%
4                     393    13.2%
5                     237    8.0%
6                     173    5.8%
7                     138    4.6%
8                     98     3.3%
9                     70     2.4%
10                    54     1.8%

Maximum 10 values

Value                 Count  Frequency (%)
206                   1      < 0.1%
198                   1      < 0.1%
124                   1      < 0.1%
97                    1      < 0.1%
91                    2      0.1%
86                    1      < 0.1%
72                    1      < 0.1%
62                    2      0.1%
60                    1      < 0.1%
57                    1      < 0.1%

qtde_items
Real number (ℝ)

Distinct: 1669
Distinct (%): 56.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1579.6698
Minimum: 1
Maximum: 196844
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 46.4 KiB

Quantile statistics

Minimum: 1
5th percentile: 101.35
Q1: 296
Median: 638
Q3: 1398.25
95th percentile: 4403.25
Maximum: 196844
Range: 196843
Interquartile range (IQR): 1102.25

Descriptive statistics

Standard deviation: 5700.0984
Coefficient of variation (CV): 3.6084113
Kurtosis: 518.2266
Mean: 1579.6698
Median Absolute Deviation (MAD): 419
Skewness: 18.7615
Sum: 4688460
Variance: 32491122
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                 Count  Frequency (%)
310                   11     0.4%
150                   9      0.3%
88                    9      0.3%
134                   8      0.3%
272                   8      0.3%
288                   8      0.3%
84                    8      0.3%
260                   8      0.3%
246                   8      0.3%
200                   7      0.2%
Other values (1659)   2884   97.2%

Minimum 10 values

Value                 Count  Frequency (%)
1                     1      < 0.1%
2                     2      0.1%
12                    2      0.1%
16                    1      < 0.1%
17                    1      < 0.1%
18                    1      < 0.1%
19                    1      < 0.1%
20                    1      < 0.1%
23                    1      < 0.1%
25                    1      < 0.1%

Maximum 10 values

Value                 Count  Frequency (%)
196844                1      < 0.1%
79879                 1      < 0.1%
77373                 1      < 0.1%
69993                 1      < 0.1%
64549                 1      < 0.1%
64124                 1      < 0.1%
62812                 1      < 0.1%
58243                 1      < 0.1%
57772                 1      < 0.1%
50255                 1      < 0.1%

qtde_products
Real number (ℝ)

Distinct: 469
Distinct (%): 15.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 122.7035
Minimum: 1
Maximum: 7837
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 46.4 KiB

Quantile statistics

Minimum: 1
5th percentile: 9
Q1: 29
Median: 67
Q3: 135
95th percentile: 382
Maximum: 7837
Range: 7836
Interquartile range (IQR): 106

Descriptive statistics

Standard deviation: 269.2812
Coefficient of variation (CV): 2.1945681
Kurtosis: 354.34185
Mean: 122.7035
Median Absolute Deviation (MAD): 44
Skewness: 15.67683
Sum: 364184
Variance: 72512.365
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                 Count  Frequency (%)
28                    46     1.5%
20                    38     1.3%
35                    35     1.2%
15                    33     1.1%
29                    32     1.1%
19                    32     1.1%
11                    32     1.1%
26                    31     1.0%
18                    30     1.0%
27                    30     1.0%
Other values (459)    2629   88.6%

Minimum 10 values

Value                 Count  Frequency (%)
1                     6      0.2%
2                     14     0.5%
3                     15     0.5%
4                     17     0.6%
5                     26     0.9%
6                     29     1.0%
7                     18     0.6%
8                     19     0.6%
9                     27     0.9%
10                    27     0.9%

Maximum 10 values

Value                 Count  Frequency (%)
7837                  1      < 0.1%
5586                  1      < 0.1%
5095                  1      < 0.1%
4577                  1      < 0.1%
2698                  1      < 0.1%
2379                  1      < 0.1%
2060                  1      < 0.1%
1818                  1      < 0.1%
1673                  1      < 0.1%
1636                  1      < 0.1%

avg_basket_size
Real number (ℝ)

Distinct: 1973
Distinct (%): 66.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 235.83169
Minimum: 1
Maximum: 6009.3333
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 46.4 KiB

Quantile statistics

Minimum: 1
5th percentile: 44
Q1: 103.2375
Median: 172
Q3: 281.375
95th percentile: 598.345
Maximum: 6009.3333
Range: 6008.3333
Interquartile range (IQR): 178.1375

Descriptive statistics

Standard deviation: 283.87459
Coefficient of variation (CV): 1.2037169
Kurtosis: 102.86589
Mean: 235.83169
Median Absolute Deviation (MAD): 82.625
Skewness: 7.7098453
Sum: 699948.45
Variance: 80584.784
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                 Count  Frequency (%)
100                   11     0.4%
114                   10     0.3%
82                    9      0.3%
73                    9      0.3%
86                    9      0.3%
75                    8      0.3%
136                   8      0.3%
140                   8      0.3%
60                    8      0.3%
88                    8      0.3%
Other values (1963)   2880   97.0%

Minimum 10 values

Value                 Count  Frequency (%)
1                     2      0.1%
2                     1      < 0.1%
3.333333333           1      < 0.1%
5.333333333           1      < 0.1%
5.666666667           1      < 0.1%
6.142857143           1      < 0.1%
7.5                   1      < 0.1%
9                     1      < 0.1%
9.5                   1      < 0.1%
11                    1      < 0.1%

Maximum 10 values

Value                 Count  Frequency (%)
6009.333333           1      < 0.1%
4282                  1      < 0.1%
3906                  1      < 0.1%
3868.65               1      < 0.1%
2880                  1      < 0.1%
2801                  1      < 0.1%
2733.944444           1      < 0.1%
2518.769231           1      < 0.1%
2160.333333           1      < 0.1%
2082.225806           1      < 0.1%

avg_unique_basket_size
Real number (ℝ)

Distinct: 1009
Distinct (%): 34.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 22.158128
Minimum: 1
Maximum: 299.70588
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 46.4 KiB

Quantile statistics

Minimum: 1
5th percentile: 3.4021531
Q1: 10
Median: 17.2
Q3: 27.75
95th percentile: 56.9475
Maximum: 299.70588
Range: 298.70588
Interquartile range (IQR): 17.75

Descriptive statistics

Standard deviation: 19.512566
Coefficient of variation (CV): 0.88060535
Kurtosis: 27.705366
Mean: 22.158128
Median Absolute Deviation (MAD): 8.2
Skewness: 3.4995256
Sum: 65765.325
Variance: 380.74025
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                 Count  Frequency (%)
13                    54     1.8%
14                    40     1.3%
11                    38     1.3%
9                     33     1.1%
18                    33     1.1%
1                     32     1.1%
20                    31     1.0%
10                    30     1.0%
16                    29     1.0%
17                    28     0.9%
Other values (999)    2620   88.3%

Minimum 10 values

Value                 Count  Frequency (%)
1                     32     1.1%
1.2                   1      < 0.1%
1.25                  1      < 0.1%
1.333333333           2      0.1%
1.5                   7      0.2%
1.568181818           1      < 0.1%
1.571428571           1      < 0.1%
1.666666667           4      0.1%
1.833333333           1      < 0.1%
2                     24     0.8%

Maximum 10 values

Value                 Count  Frequency (%)
299.7058824           1      < 0.1%
259                   1      < 0.1%
203.5                 1      < 0.1%
148                   1      < 0.1%
145                   1      < 0.1%
136.125               1      < 0.1%
135.5                 1      < 0.1%
127                   1      < 0.1%
122                   1      < 0.1%
118                   1      < 0.1%

freq
Real number (ℝ)

Distinct: 1348
Distinct (%): 45.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.063285225
Minimum: 0.0054495913
Maximum: 3
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 46.4 KiB

Quantile statistics

Minimum: 0.0054495913
5th percentile: 0.0094339623
Q1: 0.017777778
Median: 0.029411765
Q3: 0.055440135
95th percentile: 0.22222222
Maximum: 3
Range: 2.9945504
Interquartile range (IQR): 0.037662358

Descriptive statistics

Standard deviation: 0.13449713
Coefficient of variation (CV): 2.1252533
Kurtosis: 121.53966
Mean: 0.063285225
Median Absolute Deviation (MAD): 0.01432783
Skewness: 8.7727252
Sum: 187.83055
Variance: 0.018089479
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                 Count  Frequency (%)
0.3333333333          21     0.7%
0.1666666667          21     0.7%
0.02777777778         20     0.7%
0.09090909091         19     0.6%
0.0625                17     0.6%
0.4                   16     0.5%
0.1333333333          16     0.5%
0.03571428571         15     0.5%
0.02380952381         15     0.5%
0.25                  15     0.5%
Other values (1338)   2793   94.1%

Minimum 10 values

Value                 Count  Frequency (%)
0.005449591281        1      < 0.1%
0.005464480874        1      < 0.1%
0.005494505495        1      < 0.1%
0.005509641873        1      < 0.1%
0.005586592179        2      0.1%
0.005602240896        1      < 0.1%
0.005617977528        2      0.1%
0.00566572238         1      < 0.1%
0.005681818182        2      0.1%
0.005698005698        3      0.1%

Maximum 10 values

Value                 Count  Frequency (%)
3                     1      < 0.1%
2                     1      < 0.1%
1.571428571           1      < 0.1%
1.5                   3      0.1%
1                     14     0.5%
0.8333333333          1      < 0.1%
0.75                  1      < 0.1%
0.6666666667          12     0.4%
0.6487935657          1      < 0.1%
0.6                   1      < 0.1%

avg_ticket
Real number (ℝ)

HIGH CORRELATION
SKEWED

Distinct: 2965
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 32.989799
Minimum: 2.1505882
Maximum: 4453.43
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 46.4 KiB

Quantile statistics

Minimum: 2.1505882
5th percentile: 4.915888
Q1: 13.118111
Median: 17.936066
Q3: 24.882907
95th percentile: 90.052125
Maximum: 4453.43
Range: 4451.2794
Interquartile range (IQR): 11.764796

Descriptive statistics

Standard deviation: 119.53254
Coefficient of variation (CV): 3.6233182
Kurtosis: 812.9555
Mean: 32.989799
Median Absolute Deviation (MAD): 5.9616373
Skewness: 25.156777
Sum: 97913.724
Variance: 14288.028
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                 Count  Frequency (%)
15                    2      0.1%
4.162                 2      0.1%
14.47833333           2      0.1%
18.15222222           1      < 0.1%
13.92736842           1      < 0.1%
36.24411765           1      < 0.1%
29.78416667           1      < 0.1%
22.8792623            1      < 0.1%
20.51104167           1      < 0.1%
149.025               1      < 0.1%
Other values (2955)   2955   99.6%

Minimum 10 values

Value                 Count  Frequency (%)
2.150588235           1      < 0.1%
2.4325                1      < 0.1%
2.462371134           1      < 0.1%
2.511241379           1      < 0.1%
2.515333333           1      < 0.1%
2.65                  1      < 0.1%
2.656931818           1      < 0.1%
2.707598253           1      < 0.1%
2.760621572           1      < 0.1%
2.770464191           1      < 0.1%

Maximum 10 values

Value                 Count  Frequency (%)
4453.43               1      < 0.1%
3202.92               1      < 0.1%
1687.2                1      < 0.1%
952.9875              1      < 0.1%
872.13                1      < 0.1%
841.0214493           1      < 0.1%
651.1683333           1      < 0.1%
640                   1      < 0.1%
624.4                 1      < 0.1%
615.75                1      < 0.1%

avg_rec_days
Real number (ℝ)

Distinct: 1258
Distinct (%): 42.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 67.305053
Minimum: 1
Maximum: 366
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 46.4 KiB

Quantile statistics

Minimum: 1
5th percentile: 8
Q1: 25.927198
Median: 48.267857
Q3: 85.333333
95th percentile: 200.65
Maximum: 366
Range: 365
Interquartile range (IQR): 59.406136

Descriptive statistics

Standard deviation: 63.503259
Coefficient of variation (CV): 0.94351399
Kurtosis: 4.9086453
Mean: 67.305053
Median Absolute Deviation (MAD): 26.267857
Skewness: 2.0662224
Sum: 199761.4
Variance: 4032.6639
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                 Count  Frequency (%)
14                    25     0.8%
4                     22     0.7%
70                    21     0.7%
7                     20     0.7%
35                    19     0.6%
49                    18     0.6%
11                    17     0.6%
46                    17     0.6%
21                    17     0.6%
28                    16     0.5%
Other values (1248)   2776   93.5%

Minimum 10 values

Value                 Count  Frequency (%)
1                     16     0.5%
1.5                   1      < 0.1%
2                     13     0.4%
2.5                   1      < 0.1%
2.601398601           1      < 0.1%
3                     15     0.5%
3.321428571           1      < 0.1%
3.330357143           1      < 0.1%
3.5                   2      0.1%
4                     22     0.7%

Maximum 10 values

Value                 Count  Frequency (%)
366                   1      < 0.1%
365                   1      < 0.1%
363                   1      < 0.1%
362                   1      < 0.1%
357                   2      0.1%
356                   1      < 0.1%
355                   2      0.1%
352                   1      < 0.1%
351                   2      0.1%
350                   3      0.1%

qtde_returns
Real number (ℝ)

HIGH CORRELATION
SKEWED
ZEROS

Distinct: 172
Distinct (%): 5.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 24.984164
Minimum: 0
Maximum: 9014
Zeros: 1481
Zeros (%): 49.9%
Negative: 0
Negative (%): 0.0%
Memory size: 46.4 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 1
Q3: 6
95th percentile: 62.65
Maximum: 9014
Range: 9014
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 228.65147
Coefficient of variation (CV): 9.1518557
Kurtosis: 912.13278
Mean: 24.984164
Median Absolute Deviation (MAD): 1
Skewness: 26.846194
Sum: 74153
Variance: 52281.494
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                 Count  Frequency (%)
0                     1481   49.9%
1                     295    9.9%
3                     169    5.7%
6                     93     3.1%
2                     87     2.9%
4                     71     2.4%
5                     43     1.4%
12                    43     1.4%
8                     40     1.3%
7                     38     1.3%
Other values (162)    608    20.5%

Minimum 10 values

Value                 Count  Frequency (%)
0                     1481   49.9%
1                     295    9.9%
2                     87     2.9%
3                     169    5.7%
4                     71     2.4%
5                     43     1.4%
6                     93     3.1%
7                     38     1.3%
8                     40     1.3%
9                     36     1.2%

Maximum 10 values

Value                 Count  Frequency (%)
9014                  1      < 0.1%
4824                  1      < 0.1%
4027                  1      < 0.1%
2302                  2      0.1%
1776                  1      < 0.1%
1608                  1      < 0.1%
1589                  1      < 0.1%
1515                  1      < 0.1%
1278                  1      < 0.1%
1242                  1      < 0.1%

Interactions

[Figures not reproduced: 144 pairwise interaction plots, one per ordered pair of the 12 variables, rendered with Matplotlib v3.5.2.]

Correlations


Auto

The auto setting chooses an interpretable pairwise metric for each pair of columns according to the following mapping:
  • Variable_type-Variable_type : Method, Range
  • Categorical-Categorical : Cramer's V, [0,1]
  • Numerical-Categorical : Cramer's V, [0,1] (using a discretized numerical column)
  • Numerical-Numerical : Spearman's ρ, [-1,1]
The number of bins used in the discretization for the Numerical-Categorical column pair can be changed using config.correlations["auto"].n_bins. The number of bins affects the granularity of the association you wish to measure.

This configuration uses the recommended metric for each pair of columns.
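
The mapping above amounts to a lookup keyed on the pair of variable types. The sketch below illustrates that idea only; the type names and the `auto_metric` function are hypothetical and are not part of the pandas-profiling API.

```python
# Lookup table built from the mapping listed above.
# Keys are unordered pairs of (illustrative) variable type names.
AUTO_METRIC = {
    frozenset(["categorical"]): ("Cramer's V", (0, 1)),
    frozenset(["numerical", "categorical"]): (
        "Cramer's V (discretized numerical column)",
        (0, 1),
    ),
    frozenset(["numerical"]): ("Spearman's rho", (-1, 1)),
}


def auto_metric(type_a, type_b):
    """Return (method, value range) for a pair of variable types."""
    # A frozenset makes the lookup independent of argument order and
    # collapses a same-type pair to a single-element key.
    return AUTO_METRIC[frozenset([type_a, type_b])]
```

Because all 12 variables in this dataset are numeric, every pair falls into the Numerical-Numerical case, so the auto matrix here corresponds to Spearman's ρ.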

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
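
The calculation described above can be sketched in plain Python: rank both variables (ties receive their average rank), then apply the covariance-over-standard-deviations formula to the ranks. The function names are illustrative, and the data in the usage note is made up.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance divided by the product of standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)  # the 1/n factors cancel


def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))
```

For example, `spearman([1, 2, 3, 4, 5], [1, 8, 27, 64, 125])` is 1 up to floating-point rounding, because the relationship is monotonic even though it is not linear.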

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear relationship the steepness of the line does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
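
A minimal sketch of this calculation, together with a check of the location and scale invariance mentioned above, is given below. The function name and the sample numbers are illustrative only.

```python
import math


def pearson_r(x, y):
    """Pearson's r: covariance over the product of standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums are used instead of averages because the 1/n factors cancel
    # between the numerator and the denominator.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Made-up data for illustration.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 9.0]

r_original = pearson_r(x, y)
# Shift and rescale each variable separately (positive scale factors).
r_transformed = pearson_r([10 * a + 5 for a in x], [0.5 * b - 3 for b in y])
```

Here `r_original` and `r_transformed` agree up to floating-point rounding. Multiplying one variable by a negative factor would flip the sign of r but leave its magnitude unchanged.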

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
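
The pair-counting definition above can be written directly as a short function. This sketch implements exactly that formula, often called tau-a; tied pairs count as neither concordant nor discordant. Statistical libraries commonly report a tie-corrected variant instead, so results on data with many ties may differ from the report's heatmap.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        pairs += 1
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1  # both variables move in the same direction
        elif direction < 0:
            discordant += 1  # the variables move in opposite directions
        # direction == 0 means a tie in x or y: counted in neither group
    return (concordant - discordant) / pairs
```

With three observations `x = [1, 2, 3]` and `y = [1, 3, 2]`, two pairs are concordant and one is discordant, giving τ = (2 - 1) / 3.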

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

      customer_id  gross_revenue  recency_days  qtde_invoice  qtde_items  qtde_products  avg_basket_size  avg_unique_basket_size  freq      avg_ticket  avg_rec_days  qtde_returns
0     17850.0      5391.21        372.0         34.0          1733.0      297.0          50.970588        8.735294                0.486111  18.152222   35.500000     21.0
1     13047.0      3232.59        56.0          9.0           1390.0      171.0          154.444444       19.000000               0.048780  18.904035   27.250000     6.0
2     12583.0      6705.38        2.0           15.0          5028.0      232.0          335.200000       15.466667               0.045699  28.902500   23.187500     50.0
3     13748.0      948.25         95.0          5.0           439.0       28.0           87.800000        5.600000                0.017921  33.866071   92.666667     0.0
4     15100.0      876.00         333.0         3.0           80.0        3.0            26.666667        1.000000                0.136364  292.000000  8.600000      22.0
5     15291.0      4623.30        25.0          14.0          2102.0      102.0          150.142857       7.285714                0.054441  45.326471   23.200000     27.0
6     14688.0      5630.87        7.0           21.0          3621.0      327.0          172.428571       15.571429               0.073569  17.219786   18.300000     281.0
7     17809.0      5411.91        16.0          12.0          2057.0      61.0           171.416667       5.083333                0.039106  88.719836   35.700000     41.0
8     15311.0      60767.90       0.0           91.0          38194.0     2379.0         419.714286       26.142857               0.315508  25.543464   4.144444      231.0
9     16098.0      2005.63        87.0          7.0           613.0       67.0           87.571429        9.571429                0.024390  29.934776   47.666667     0.0

Last rows

      customer_id  gross_revenue  recency_days  qtde_invoice  qtde_items  qtde_products  avg_basket_size  avg_unique_basket_size  freq      avg_ticket  avg_rec_days  qtde_returns
5626  17727.0      1060.25        15.0          1.0           645.0       66.0           645.000000       66.0                    0.285714  16.064394   6.0           6.0
5636  17232.0      421.52         2.0           2.0           203.0       36.0           101.500000       18.0                    0.153846  11.708889   12.0          0.0
5637  17468.0      137.00         10.0          2.0           116.0       5.0            58.000000        2.5                     0.400000  27.400000   4.0           0.0
5648  13596.0      697.04         5.0           2.0           406.0       166.0          203.000000       83.0                    0.250000  4.199036    7.0           0.0
5654  14893.0      1237.85        9.0           2.0           799.0       73.0           399.500000       36.5                    0.666667  16.956849   2.0           0.0
5658  12479.0      473.20         11.0          1.0           382.0       30.0           382.000000       30.0                    0.333333  15.773333   4.0           34.0
5679  14126.0      706.13         7.0           3.0           508.0       15.0           169.333333       5.0                     1.000000  47.075333   3.0           50.0
5685  13521.0      1092.39        1.0           3.0           733.0       435.0          244.333333       145.0                   0.300000  2.511241    4.5           0.0
5695  15060.0      301.84         8.0           4.0           262.0       120.0          65.500000        30.0                    2.000000  2.515333    1.0           0.0
5714  12558.0      269.96         7.0           1.0           196.0       11.0           196.000000       11.0                    0.285714  24.541818   6.0           102.0
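
The report does not define how the derived columns were built, but the sample rows are numerically consistent with simple ratios of the other columns. The sketch below checks this for the first sample row (index 0); the ratio definitions are an inference from the sample values, not something the report states, and the function name is hypothetical.

```python
# First sample row (index 0), values as shown in the sample above.
row = {
    "gross_revenue": 5391.21,
    "qtde_invoice": 34.0,
    "qtde_items": 1733.0,
    "qtde_products": 297.0,
    "avg_basket_size": 50.970588,
    "avg_unique_basket_size": 8.735294,
    "avg_ticket": 18.152222,
}


def implied_ratios(r):
    """Ratios that appear to reproduce the derived columns (assumed)."""
    return {
        "avg_basket_size": r["qtde_items"] / r["qtde_invoice"],
        "avg_unique_basket_size": r["qtde_products"] / r["qtde_invoice"],
        "avg_ticket": r["gross_revenue"] / r["qtde_products"],
    }


ratios = implied_ratios(row)
# Largest absolute difference between implied and reported values.
max_gap = max(abs(ratios[name] - row[name]) for name in ratios)
```

For this row the implied ratios match the reported values to within the six decimal places shown, which also helps explain why these columns trigger the high-correlation alerts against their source columns.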